Comparing Prosody Formalisms for Machine Learning
Abstract
We seek the prosody formalism best suited to machine learning. The target application is a prosody generation module for text-to-speech synthesis, which will learn prosody marks (parameters or symbols) from large corpora. The formalism should be general, perceptually relevant, restorable, automatically obtainable, objective and learnable. The main formalisms for pitch description are briefly described and compared, namely the Fujisaki model, ToBI, INTSINT, Tilt and the “Glissando threshold” adaptation. The most suitable method of pitch description for machine learning is the “Glissando threshold” adaptation with an additional simplification.
Similar resources
Prosody Modeling in Concept-to-Speech Generation: Methodological Issues
We explore three issues for the development of Concept-to-Speech (CTS) systems. We identify information available in a language generation system that has the potential to impact prosody; investigate the role played by different corpora in CTS prosody modeling; and explore different methodologies for learning how linguistic features impact prosody. Our major focus is on the comparison of two ma...
Using Machine Learning Algorithms for Automatic Cyber Bullying Detection in Arabic Social Media
Social media allows people to interact and express their thoughts or feelings about different subjects. However, some users may write offensive tweets to others via social media, which is known as cyber bullying. Successful prevention depends on automatically detecting malicious messages. Automatic detection of bullying in the text of social media by analyzing the text "tweets" via one of the machine l...
Comparing Bayesian Network Classifiers
In this paper, we empirically evaluate algorithms for learning four Bayesian network (BN) classifiers: Naïve-Bayes, tree augmented Naïve-Bayes (TANs), BN augmented Naïve-Bayes (BANs) and general BNs (GBNs), where the GBNs and BANs are learned using two variants of a conditional independence based BN-learning algorithm. Experimental results show the GBNs and BANs learned using the proposing learn...
Transformation-based learning of Danish stress assignment
In Danish, as in other languages, prosody assignment is fairly well described as a function of lexical and syntactic structure. So in principle, prosodic clue assignment should be open to machine learning techniques. This paper presents an experiment using transformation-based ML for unsupervised learning of Danish main stress assignment. The trained stress assigner is compared to the leading D...
Modeling Prosodic Structures in Linguistically Enriched Environments
A significant challenge in Text-to-Speech (TtS) synthesis is the formulation of the prosodic structures (phrase breaks, pitch accents, phrase accents and boundary tones) of utterances. The prediction of these elements robustly relies on the accuracy and the quality of error-prone linguistic procedures, such as the identification of the part-of-speech and the syntactic tree. Additional linguisti...